Abstract

Statistics of the Hurst scaling exponents calculated with two methods, the recently introduced
Detrended Moving Average Analysis (DMA) and Detrended Fluctuation Analysis (DFA), are compared.
The analysis is done for artificial stochastic Brownian time series of various lengths and reveals interesting
statistical relationships between the two methods. Good agreement between the DFA and DMA techniques is
found for long time series L ~ 10^, however for shorter series we observe that the two methods give different
results with no systematic relation between them. It is shown that, on average, the DMA method overestimates
the Hurst exponent in comparison with the DFA technique.



1 Introduction 

A main problem discussed in the context of stochastic time series in various physical, biological, financial and
economic processes is the presence of autocorrelations in data. One of the techniques used to check whether such
autocorrelations are present is based on the investigation of the fractal structure of the time series and
is related to the scaling exponent $H$, sometimes also denoted $\alpha$ [3-0 and called the Hurst exponent. It plays a
significant role as the main concept upon which fluctuations of a time series around its local trend (drift) are
formed, and it may be considered one of the crucial elements of the 'genetic code' of time series of
various origin. For the purpose of the fractal analysis mentioned above one can introduce the scaling exponent $\alpha$
as follows.

Let $x(t)$ ($t = 1, \dots, L$) be the time series defined at discrete time points $t$. By rescaling the time axis $\gamma$ times
(e.g. enlarging it by a power of 10), one reveals the fine structure of the time series not visible at lower resolution ($\gamma \sim 1$).
The fractal structure of the series comes from the relation:

$$x'(t') = \Gamma\, x(\gamma^{-1} t') \sim x(t) \qquad (1)$$

where $\sim$ means similarity correspondence.

The above formula indicates that the magnitude of the rescaled time series $x(\gamma^{-1} t')$ should be simultaneously
increased $\Gamma$ times in order to satisfy full (local) equivalence of the $x(t)$ and $x'(t')$ series.

It turns out that the scaling factor $\Gamma$ can be expressed in terms of the time rescaling factor $\gamma$ with the use of
the Hurst-Hausdorff exponent $\alpha$ ($\alpha > 0$):

$$\Gamma = \gamma^{\alpha} \qquad (2)$$
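As a rough numerical illustration of Eq. (2), the sketch below measures how much the typical displacement of a series grows when the time lag is enlarged $\gamma$ times and then solves $\Gamma = \gamma^{\alpha}$ for $\alpha$. The helper names `rms_displacement` and `alpha_from_rescaling`, the use of the root-mean-square displacement as the 'magnitude', the Gaussian test walk and all parameter values are assumptions made for illustration only; they are not taken from the analysis in this paper.

```python
import math
import random

def rms_displacement(x, lag):
    """Root-mean-square change of the series x over a given time lag."""
    diffs = [x[t + lag] - x[t] for t in range(len(x) - lag)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def alpha_from_rescaling(x, gamma, base_lag=8):
    """Solve Gamma = gamma**alpha for alpha, with Gamma measured on x."""
    big_gamma = rms_displacement(x, gamma * base_lag) / rms_displacement(x, base_lag)
    return math.log(big_gamma) / math.log(gamma)

# Uncorrelated random walk with Gaussian steps (an assumed test signal).
rng = random.Random(0)
walk = [0.0]
for _ in range(20000):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

alpha = alpha_from_rescaling(walk, gamma=4)
print(f"estimated alpha = {alpha:.3f}")
```

For an uncorrelated walk of this kind the estimate should lie close to 1/2, up to statistical scatter.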

The commonly accepted methods to measure the $\alpha$ exponent are Rescaled Range Analysis (R/S), spectral
density analysis @], and Detrended Fluctuation Analysis (DFA) [S]. Recently, a new method called Detrended
Moving Average (DMA) has also been proposed |S1IZ|. In this article we will focus on the latter two methods due
to large uncertainties in spectral density analysis and problems with R/S predictions in nonstationary series.
Efforts to better understand how the results of these two methods relate to each other are in progress I7|-|H1.

The DFA method was first developed for biological purposes J_ and then also applied to finance dlj-J^l. It is
a detrending technique that basically measures fluctuations of a given time series around its local trend as a
function of the trend length. Let us recall the main steps of this method:

1. A given signal $x(t)$ ($t = 1, \dots, L$) of the time series is divided into $L/\tau$ non-overlapping boxes of length $\tau$ each.

2. A polynomial fit $x_{\tau,k}$ is constructed in each box, representing the local trend in that box, where $k$ is the
order of the polynomial fit.

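3. A detrended signal $X_{\tau,k}(t)$ is found:

$$X_{\tau,k}(t) = x(t) - x_{\tau,k}(t) \qquad (3)$$

and then its fluctuation (standard deviation) $F_{DFA}(\tau, k)$ is calculated:

$$F_{DFA}(\tau, k) = \left( \frac{1}{L} \sum_{t=1}^{L} X_{\tau,k}^{2}(t) \right)^{1/2} \qquad (4)$$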

4. From the basic stochastic differential equation of the time series $x(t)$ with a local drift $\mu(t)$ and a local
dispersion $\sigma(t)$

$$dx(t) = \mu(t)\,dt + \sigma(t)\,dX(t) \qquad (5)$$

one expects the power-law behavior:

$$F_{DFA}(\tau, k) \sim \tau^{\alpha(k)} \qquad (6)$$

where $\alpha(k)$ is the sought Hurst exponent.
The last equation enables one to calculate the $\alpha$ exponent directly from a log-log linear fit:

$$\log F_{DFA}(\tau, k) \sim \alpha(k) \log \tau \qquad (7)$$

It can be proved that $\alpha(k)$ depends very weakly on $k$ |12l I13j, so in most applications one takes a linear function
($k = 1$) as a good candidate for $x_{\tau,k}$. This approach will also be used in our paper.
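The DFA steps above, for the linear case $k = 1$, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used for the results of this paper: the function names `linear_fit`, `dfa_fluctuation` and `dfa_exponent`, the set of box sizes and the Gaussian test walk are hypothetical, and the fluctuation is normalized by the number of points actually covered by complete boxes rather than by $L$.

```python
import math
import random

def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def dfa_fluctuation(x, tau):
    """F_DFA(tau, k=1): RMS residual around a linear trend fitted in each box."""
    n_boxes = len(x) // tau          # incomplete trailing box is ignored
    ts = list(range(tau))
    total = 0.0
    for b in range(n_boxes):
        box = x[b * tau:(b + 1) * tau]
        slope, intercept = linear_fit(ts, box)
        total += sum((v - (slope * t + intercept)) ** 2 for t, v in zip(ts, box))
    return math.sqrt(total / (n_boxes * tau))

def dfa_exponent(x, taus):
    """Slope of log F_DFA versus log tau, in the spirit of Eq. (7)."""
    log_tau = [math.log(t) for t in taus]
    log_f = [math.log(dfa_fluctuation(x, t)) for t in taus]
    slope, _ = linear_fit(log_tau, log_f)
    return slope

# Assumed test signal: uncorrelated walk with Gaussian steps.
rng = random.Random(1)
walk, pos = [], 0.0
for _ in range(5000):
    pos += rng.gauss(0.0, 1.0)
    walk.append(pos)

alpha_dfa = dfa_exponent(walk, [8, 16, 32, 64, 128])
print(f"alpha_DFA = {alpha_dfa:.3f}")
```

The box sizes here are fixed by hand; in a real analysis the scaling range would have to be chosen with care, since the fitted slope depends on it.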

It turns out that the larger $\alpha$ is, the 'quieter' the time series is, i.e. the signal fluctuates in a more correlated
way. In fact, for $0 < \alpha < 1/2$ we have negative autocorrelations (antipersistence) in the time series. On the other
hand, if $1/2 < \alpha < 1$, there are positive autocorrelations (persistence) in the signal. The case $\alpha = 1/2$ corresponds
to a completely uncorrelated signal, the so-called integer Brownian walk. An existing link between the $\alpha$ exponent and
the probability that a given trend will last in the immediate future if it did so in the immediate past gives an
additional hint about the possibility of forecasting trend changes [Tl| .

The Detrended Moving Average (DMA) technique looks very similar to DFA. The main difference is that,
instead of a linear or polynomial detrending procedure in equally sized boxes, one uses a moving
average of a given length $\lambda$. The basic steps of the DMA analysis are then:

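1. A simple moving average of length $\lambda$ ($\lambda = 1, \dots, L$) is constructed for the $x(t)$ series ($t \geq \lambda$):

$$\langle x(t) \rangle_{\lambda} = \frac{1}{\lambda} \sum_{k=0}^{\lambda-1} x(t-k) \qquad (8)$$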

2. A detrended signal is found similarly to Eq. (3):

$$X_{\lambda}(t) = x(t) - \langle x(t) \rangle_{\lambda} \qquad (9)$$
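and its fluctuation within a window of size $\lambda$ now reads:

$$F_{DMA}(\lambda) = \left( \frac{1}{L-\lambda} \sum_{t=\lambda}^{L} X_{\lambda}^{2}(t) \right)^{1/2} \qquad (10)$$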



3. Similarly to DFA, a power law should be observed:

$$\log F_{DMA}(\lambda) \sim \alpha \log \lambda \qquad (11)$$

where $\alpha$ is the sought Hurst scaling exponent.
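The DMA steps above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the names `dma_fluctuation`, `dma_exponent` and `log_log_slope` are hypothetical, the moving average is taken over the trailing $\lambda$ points, the mean square is normalized by the number of points for which that average exists, and the window lengths and the Gaussian test walk are arbitrary choices.

```python
import math
import random

def log_log_slope(xs, ys):
    """Least-squares slope of log(ys) against log(xs)."""
    lx = [math.log(v) for v in xs]
    ly = [math.log(v) for v in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

def dma_fluctuation(x, lam):
    """F_DMA(lam): RMS deviation of x(t) from its trailing moving average.

    Intended for 2 <= lam <= len(x); lam = 1 gives zero fluctuation.
    """
    window_sum = sum(x[:lam])                 # sum over the first lam points
    total = 0.0
    count = 0
    for t in range(lam - 1, len(x)):
        if t >= lam:
            window_sum += x[t] - x[t - lam]   # slide the window by one step
        total += (x[t] - window_sum / lam) ** 2
        count += 1
    return math.sqrt(total / count)

def dma_exponent(x, lams):
    """Slope of log F_DMA versus log lambda, in the spirit of Eq. (11)."""
    return log_log_slope(lams, [dma_fluctuation(x, lam) for lam in lams])

# Assumed test signal: uncorrelated walk with Gaussian steps.
rng = random.Random(1)
walk, pos = [], 0.0
for _ in range(5000):
    pos += rng.gauss(0.0, 1.0)
    walk.append(pos)

alpha_dma = dma_exponent(walk, [8, 16, 32, 64, 128])
print(f"alpha_DMA = {alpha_dma:.3f}")
```

The sliding-sum update makes each window length cost a single pass over the series, which is one reason the procedure is simple to implement.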

The DMA technique is less complicated and seems to be faster in practical applications than the DFA algorithm.
However, so far no clear final conclusion has been reached regarding the mutual relationship between DFA and
DMA results for the same series. This article contributes to this area of interest.



2 DMA-DFA Comparison Study 

Preliminary results obtained for some real financial series suggest that $\alpha_{DMA}$ values are lower than the
corresponding $\alpha_{DFA}$ results. This seems to be confirmed for the set of artificial time series of length L ~ 2^ constructed
with the use of the Random Midpoint Displacement (RMD) algorithm, where one finds $\alpha_{DFA} \approx \alpha_{DMA} + 0.05$.
This supports the existence of a systematic displacement between DFA and DMA results, at least for longer series.
In many practical applications, however, the length of the time series we deal with is shorter (e.g. finance, biology,
genetics, medicine), especially if one looks at the local $\alpha$ exponent value rather than the global one.

To attack the problem of mutual dependence between DMA and DFA results for series of various lengths, let
us first look at the set of artificial arithmetic integer Brownian time series of length $L = 3 \times 10^4$ with discrete
time interval $\Delta t = 1$, i.e.:

$$x(L \Delta t) = x_0 + \sum_{k=1}^{L} \Delta x_k \qquad (12)$$

where $\Delta x_k$ ($k = 1, \dots, L$) are centered and normalized displacements generated by a random number generator.
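A series of the form of Eq. (12) can be generated with a few lines of code. The sketch below assumes Gaussian displacements with zero mean and unit variance; the text only states that the displacements are centered and normalized, so the Gaussian choice, the function name `brownian_series` and the seed handling are assumptions made for illustration.

```python
import random

def brownian_series(length, x0=0.0, seed=None):
    """Return [x(1), ..., x(L)] with x(t) = x0 + sum of the first t displacements."""
    rng = random.Random(seed)
    x = x0
    series = []
    for _ in range(length):
        x += rng.gauss(0.0, 1.0)   # centered, unit-variance displacement (assumed Gaussian)
        series.append(x)
    return series

series = brownian_series(30000, seed=42)
print(len(series), series[-1])
```

Passing a fixed `seed` makes a given series reproducible, which is convenient when the same series has to be analysed with both DFA and DMA.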

Two cases with opposite relations between $\alpha_{DMA}$ and $\alpha_{DFA}$ are shown in Fig. 1. In the first case $\alpha_{DFA} > \alpha_{DMA}$
and $\alpha_{DFA} - \alpha_{DMA} \approx 0.02$; in the other one $\alpha_{DFA} < \alpha_{DMA}$ and $\alpha_{DMA} - \alpha_{DFA} \approx 0.04$. Thus no systematic
relationship is produced.

Figure 1: Examples of DFA and DMA $\alpha$ exponent fits for artificial Brownian time series of length L = 30000,
where (a) $\alpha_{DFA} > \alpha_{DMA}$ and (b) $\alpha_{DFA} < \alpha_{DMA}$.

This suggests treating the problem statistically, i.e. one should find the statistical distributions of Hurst exponents
measured with the two methods for artificial series of various lengths. It is interesting to compare the two
statistics and to work out correlations between scaling exponents measured with the DMA and DFA techniques
for the same sample of time series.

For this purpose we took samples of arithmetic Brownian time series of length L in the range 10^ - 10^.
Each sample contained N ~ 65000 series of fixed length. We tried to cover the whole range of L uniformly in
log scale, keeping $L \sim L_0 q^n$ with the approximate log step $q \sim 7/4$ to create a variety of lengths.

For any sample of fixed-length series the averaged scaling range $\langle \tau \rangle$ or $\langle \lambda \rangle$ has been calculated for a defined
number of candidates (~30), together with the corresponding standard deviation $\sigma_{\tau}$ ($\sigma_{\lambda}$). The scaling range was taken as
the range of the $\tau$ or $\lambda$ variables strictly obeying the scaling laws of Eqs. (7), (11), and was assumed to terminate
at $\langle \tau \rangle - \sigma_{\tau}$ for DFA and at $\langle \lambda \rangle - \sigma_{\lambda}$ for DMA, respectively. Only series with a regression correlation coefficient > 0.98
were taken into account for the extraction of the $\alpha$ exponent. For each sample of time series a statistical distribution of
$\alpha_{DFA}$ and $\alpha_{DMA}$ frequencies has been built.

The full range of the obtained distributions is shown in Figs. 2-9. The first observation one makes is that for
any length L both distributions fit normal distributions very well, but with different parameters of the Gaussian
curve. We also made all plots for centered and normalized $\alpha$ frequencies in semi-log scale (Figs. 2(b,c)-9(b,c)).
Only small deviations from the normal distribution are observed in the tails, basically due to the smaller statistics
there. A good correspondence with the Gaussian curve is also confirmed by the Kolmogorov-Smirnov and Anderson-Darling tests,
whose results are displayed in Table 1 and shown for chosen lengths L in Fig. 10.
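For orientation, the two test statistics named above can be computed against a normal distribution as in the sketch below. It is a hedged illustration, not the procedure behind Table 1: the function names are hypothetical, the mean and standard deviation are estimated from the sample itself, and only the raw statistics are returned (no critical values or p-values, which would need tables appropriate to estimated parameters). The Gaussian and uniform samples are synthetic stand-ins, not exponent data from this study.

```python
import math
import random

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def normal_sf(z):
    """Standard normal survival function, 1 - CDF."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def standardize(values):
    """Sorted sample, centered and normalized with its own mean and std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return sorted((v - mean) / std for v in values)

def ks_statistic(values):
    """Kolmogorov-Smirnov distance between the empirical and normal CDFs."""
    z = standardize(values)
    n = len(z)
    d = 0.0
    for i, zi in enumerate(z):
        cdf = normal_cdf(zi)
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

def anderson_darling_statistic(values):
    """Anderson-Darling A^2 statistic against the normal distribution."""
    z = standardize(values)
    n = len(z)
    s = 0.0
    for i in range(n):
        s += (2 * i + 1) * (math.log(normal_cdf(z[i])) + math.log(normal_sf(z[n - 1 - i])))
    return -n - s / n

rng = random.Random(3)
gauss_sample = [rng.gauss(0.5, 0.05) for _ in range(2000)]
uniform_sample = [rng.uniform(0.4, 0.6) for _ in range(2000)]

ks_gauss, ad_gauss = ks_statistic(gauss_sample), anderson_darling_statistic(gauss_sample)
ks_unif, ad_unif = ks_statistic(uniform_sample), anderson_darling_statistic(uniform_sample)
print(ks_gauss, ad_gauss, ks_unif, ad_unif)
```

Both statistics are small for a sample that follows the Gaussian curve and grow when the sample departs from it; the Anderson-Darling statistic weights the tails more heavily, which is where the text reports the small deviations.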



Table 1: Kolmogorov-Smirnov (KS) and Anderson-Darling (AD) test results for the distribution of fluctuations in
the $\alpha$ exponent as a function of the method (DFA, DMA) and the length L of the time series.

One may notice that the standard deviation $\sigma_{DFA}$ of the $\alpha_{DFA}$ scaling parameters is always smaller than the
corresponding standard deviation $\sigma_{DMA}$ of the $\alpha_{DMA}$ exponents, and both standard deviations decrease when L
grows. This can be explained in terms of the different sensitivity of the DFA and DMA techniques to the presence
of random autocorrelations in time series. Such autocorrelations are naturally randomly distributed in any
sample of generated time series and hence the distribution of the $\alpha$ exponent is normal. The probability of random
autocorrelations is larger for short time series, where all statistical fluctuations manifest themselves more vividly.
When L increases, their influence on the presumed global autocorrelation in the series can be neglected. Therefore,
both standard deviations $\sigma_{DFA}$ and $\sigma_{DMA}$ drop with increasing L. However, we always observe $\sigma_{DFA} < \sigma_{DMA}$,
which indicates that the DMA technique is more sensitive to such "autocorrelation noise" than DFA.

One may also look at this problem from another side, as in Fig. 11. Here we have drawn several plots of the
DFA and DMA analysis, i.e. $\ln F$ vs $\ln \tau$ or $\ln \lambda$, for several corresponding artificial Brownian series of length
L = 1000. It is seen that deviations from the strict power-law behavior, if they occur, are more drastic for DMA
than for DFA, and the dispersion of the resulting slopes is also larger for DMA than for DFA, despite the fact
that the DMA plots are smoother than the DFA ones.

The next observation concerns the mean values. One gets $\langle \alpha_{DFA} \rangle_N < \langle \alpha_{DMA} \rangle_N$ for all L, where $\langle \cdot \rangle_N$ is
taken over a sample of N time series. A clear shift of the central DMA values to the right with respect to the DFA
ones (see Figs. 2-9(a)) does not, however, imply a systematic relation between $\alpha_{DMA}$ and $\alpha_{DFA}$.
Indeed, evaluating the correlation coefficient (values are shown in the description of Figs. 2-9(a)):



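$$\mathrm{corr}(\alpha_{DFA}, \alpha_{DMA}) = \frac{\langle \alpha_{DFA}\,\alpha_{DMA} \rangle_N - \langle \alpha_{DFA} \rangle_N \langle \alpha_{DMA} \rangle_N}{\sigma_{DFA}\,\sigma_{DMA}} \qquad (13)$$

one finds it increasing with L, but it never indicates full correlation. Its value is maximal for large L, where
$\mathrm{corr}(\alpha_{DFA}, \alpha_{DMA}) \sim 0.8$ for L ~ 10^ - 10^.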
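The correlation coefficient of Eq. (13) is straightforward to evaluate once both exponents are available for every series in a sample. The sketch below is an assumption-labelled illustration: the function names are hypothetical, the standard deviations are population values (division by $N$), and the four exponent pairs are invented numbers used only to exercise the code, not results of this study.

```python
import math

def sample_mean(values):
    """Average <.>_N over a sample."""
    return sum(values) / len(values)

def correlation(a_dfa, a_dma):
    """corr = (<ab> - <a><b>) / (sigma_a * sigma_b), following Eq. (13)."""
    mean_dfa = sample_mean(a_dfa)
    mean_dma = sample_mean(a_dma)
    mean_prod = sample_mean([p * q for p, q in zip(a_dfa, a_dma)])
    sigma_dfa = math.sqrt(sample_mean([(p - mean_dfa) ** 2 for p in a_dfa]))
    sigma_dma = math.sqrt(sample_mean([(q - mean_dma) ** 2 for q in a_dma]))
    return (mean_prod - mean_dfa * mean_dma) / (sigma_dfa * sigma_dma)

# Invented exponent pairs for four hypothetical series.
alpha_dfa = [0.47, 0.50, 0.52, 0.49]
alpha_dma = [0.50, 0.51, 0.56, 0.49]
print(correlation(alpha_dfa, alpha_dma))
```

A value of 1 would mean that one exponent is an exact increasing linear function of the other; values below 1 leave room for series in which the two methods disagree in either direction.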

This situation is illustrated graphically in Fig. 12, where a correlation plot of $\alpha_{DFA}$ vs $\alpha_{DMA}$ is shown for
Hurst exponent values obtained for L = 3000, L = 10000 and L = 30000 series. From the asymmetry of the plots
about the diagonal one notices that DMA gives higher values than DFA in most series. This result is
independent of the length of the time series. In fact the percentage excess of cases $n_+$, where $\alpha_{DMA} > \alpha_{DFA}$, over
the cases where $\alpha_{DMA} < \alpha_{DFA}$ ($n_-$), i.e.:



(14) 

Figure 2: (a) Distribution of the scaling exponent $\alpha$ obtained with the use of DFA (circles) and DMA (squares)
techniques for the sample of 65000 series of length L = 600. The normal distribution fit with corresponding
parameters is also shown as a solid line. (b)(c) The same plots for DFA (b) and DMA (c) in semi-log scale for
normalized and centered $\alpha$ exponents.

Figure 3: (a) Distribution of $\alpha$ exponents for series with L = 1000. (b)(c) Corresponding plots in semi-log scale.

Figure 4: (a) Distribution of $\alpha$ exponents for series with L = 1800. (b)(c) Corresponding plots in semi-log scale.

Figure 5: (a) Distribution of $\alpha$ exponents for series with L = 3000. (b)(c) Corresponding plots in semi-log scale.

Figure 6: (a) Distribution of $\alpha$ exponents for series with L = 6000. (b)(c) Corresponding plots in semi-log scale.

Figure 9: (a) Distribution of $\alpha$ exponents for series with L = 30000. Additional lines represent the L = 1000 normal
fit drawn for comparison in the same scale. (b)(c) Corresponding plots in semi-log scale.

Figure 10: Kolmogorov-Smirnov (a)(c)(e) and Anderson-Darling (b)(d)(f) tests of correspondence between the
obtained distributions and the Gaussian one, drawn respectively for time series of length L = 1000, L = 3000,
L = 20000.

Figure 11: Examples of DMA and corresponding DFA plots of $\ln F$ vs $\ln \tau$ ($\ln \lambda$) for several randomly chosen
Brownian integer time series of length L = 1000.

Figure 13: The mean difference $\Delta_{DFA-DMA}$ between $\alpha_{DFA}$ and $\alpha_{DMA}$ exponents calculated for the same series
as a function of the series length L. Marked error bars $\delta\Delta_{DFA-DMA} = \delta\alpha_{DFA} + \delta\alpha_{DMA}$ correspond to
uncertainties in slope determination in the regression analysis for both methods.



changes from ~20%-25% for series with L < 10000 up to ~50% for longer series.
Since differences of opposite sign partly cancel in the average, the mean difference $\delta_{DFA-DMA}$, where

$$\delta_{DFA-DMA} = \langle \alpha_{DFA} - \alpha_{DMA} \rangle_N \qquad (15)$$

is not a good measure of the 'distance' between the two investigated methods. It is more convenient to define this
distance in a standard way, i.e.:

$$\Delta_{DFA-DMA} = \left( \langle (\alpha_{DFA} - \alpha_{DMA})^2 \rangle_N \right)^{1/2} \qquad (16)$$
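The difference between the two measures of Eqs. (15) and (16) is easy to see numerically. In the sketch below the function names and the four exponent pairs are hypothetical, chosen so that positive and negative differences balance; they are not data from this study.

```python
import math

def mean_difference(a_dfa, a_dma):
    """delta = <alpha_DFA - alpha_DMA>_N, as in Eq. (15)."""
    return sum(p - q for p, q in zip(a_dfa, a_dma)) / len(a_dfa)

def rms_distance(a_dfa, a_dma):
    """Delta = sqrt(<(alpha_DFA - alpha_DMA)^2>_N), as in Eq. (16)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a_dfa, a_dma)) / len(a_dfa))

# Invented pairs: the two methods disagree in every series, in both directions.
alpha_dfa = [0.50, 0.48, 0.53, 0.47]
alpha_dma = [0.52, 0.46, 0.49, 0.51]

delta = mean_difference(alpha_dfa, alpha_dma)
distance = rms_distance(alpha_dfa, alpha_dma)
print(delta, distance)
```

Here the mean difference is essentially zero although no single pair agrees, while the root-mean-square distance stays clearly positive, which is the motivation for preferring the second definition.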

A sufficient number of time series samples of various lengths has been worked out to find the relationship
$\Delta_{DFA-DMA}(L)$. The polynomial best fit for the collected data is drawn in Fig. 13 with error bars coming from
the uncertainties in slope determination. This plot indicates that the average displacement between the $\alpha_{DFA}$ and
$\alpha_{DMA}$ exponents for a given time series ranges from 15% for series with L < 10^ down to 2% for long series
(L ~ 10^). The latter value is much smaller than the one reported in the literature. The fastest drop in the DFA-DMA
distance is observed for medium-length series, i.e. when L ~ 10^ - 10^. For such series $\Delta_{DFA-DMA}$ amounts on average to
~10% of the $\alpha_{DFA}$ value.

This might be of interest if a more detailed study of the $\alpha$ exponent is required for more exact predictions to be
made (e.g. heart diseases, finance, etc.). The plot in Fig. 13 may also suggest that $\Delta_{DFA-DMA} \to 0$ when
$L \to \infty$. The latter case has not been explored in detail.



3 Conclusions 

We report from the analysis of artificial Brownian integer time series and from the collected data that, on
average, the DMA method overestimates Hurst exponent values in comparison with the DFA technique. This result
contradicts some previous hypotheses in the literature. The DMA method also seems to be more sensitive than DFA to the
presence of random fluctuations in autocorrelations in time series. In many practical
situations, especially for shorter series, this might be a disadvantage, leading to a false signal of global
autocorrelations that do not really exist in the time series.

The mean distance between the two methods, i.e. the mean difference between the $\alpha_{DFA}$ and $\alpha_{DMA}$ exponents
calculated for time series of a given length L, is a decreasing function of L. For shorter series (L < 6000) this
distance reaches ~15%, which might be important in the precise determination of the $\alpha$ exponent for such series.
There are some open questions. It is not exactly clear where the scaling law starts or terminates, so one
needs stricter requirements on how the scaling range should be determined for the DFA and DMA techniques
and how uncertainties in the choice of the scaling range are related to uncertainties in the scaling exponent $\alpha$. This
work is now in progress [15].